89 research outputs found

    A hierarchic task-based programming model for distributed heterogeneous computing

    Distributed computing platforms are evolving into heterogeneous ecosystems, with Clusters, Grids, and Clouds introducing, in their computing nodes, processors with different core architectures, accelerators (e.g., GPUs, FPGAs), and different memory and storage devices in order to achieve better performance with lower energy consumption. As a consequence of this heterogeneity, programming applications for these distributed heterogeneous platforms becomes a complex task. In addition to the complexity of developing an application for distributed platforms, developers must now also deal with the complexity of the different computing devices inside each node. In this article, we present a programming model that aims to facilitate the development and execution of applications on current and future distributed heterogeneous parallel architectures. This programming model is based on the hierarchical composition of the COMP Superscalar and Omp Superscalar programming models, which allows developers to implement infrastructure-agnostic applications. The underlying runtime enables applications to adapt to the infrastructure without the need to maintain different versions of the code. Our programming model proposal has been evaluated on real platforms in terms of heterogeneous resource usage, performance, and adaptation. This work has been supported by the European Commission through the Horizon 2020 Research and Innovation program under contract 687584 (TANGO project), by the Spanish Government under contract TIN2015-65316 and grant SEV-2015-0493 (Severo Ochoa Program), and by Generalitat de Catalunya under contracts 2014-SGR-1051 and 2014-SGR-1272. Peer Reviewed. Postprint (author's final draft).
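The task-based idea underlying such models can be sketched minimally: tasks declare the named data they read and write, and a runtime resolves dependencies through a shared data store, so the same code can run on any infrastructure. The `Runtime` class and its `task` method below are hypothetical illustrations, not the COMPSs/OmpSs API.

```python
# Minimal sketch of task-based dependency execution (illustrative only;
# COMPSs/OmpSs expose this through annotations, not this hypothetical API).

class Runtime:
    def __init__(self):
        self.values = {}  # shared data store keyed by data-item name

    def task(self, func, inputs, output):
        # Each task reads named data items and produces one named result;
        # the runtime resolves dependencies through the shared store.
        args = [self.values[name] for name in inputs]
        self.values[output] = func(*args)
        return self.values[output]

rt = Runtime()
rt.values["a"], rt.values["b"] = 2, 3
rt.task(lambda x, y: x + y, ["a", "b"], "sum")   # coarse outer task
rt.task(lambda s: s * s, ["sum"], "result")      # nested finer-grained task
print(rt.values["result"])  # 25
```

Because tasks only name their data, the same graph could be mapped to a multicore node or a distributed cluster without changing application code, which is the point of the hierarchical composition.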

    AutoParallel: A Python module for automatic parallelization and distributed execution of affine loop nests

    Recent improvements in programming languages, programming models, and frameworks have focused on abstracting users from many programming issues. Among others, recent programming frameworks offer simpler syntax, automatic memory management and garbage collection, simplified code reuse through library packages, and easily configurable deployment tools. For instance, Python has risen to the top of the list of programming languages due to the simplicity of its syntax, while still achieving good performance despite being an interpreted language. Moreover, the community has helped develop a large number of libraries and modules, tuning them to obtain great performance. However, there is still room for improvement in shielding users from dealing directly with distributed and parallel computing issues. This paper proposes and evaluates AutoParallel, a Python module that automatically finds an appropriate task-based parallelization of affine loop nests and executes them in parallel on a distributed computing infrastructure. This parallelization can also include the construction of data blocks to increase task granularity in order to achieve good execution performance. Moreover, AutoParallel is based on sequential programming and requires only a small annotation in the form of a Python decorator, so that users with minimal programming skills can scale up an application to hundreds of cores. Comment: Accepted to the 8th Workshop on Python for High-Performance and Scientific Computing (PyHPC 2018).
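The blocking transformation described above can be sketched as follows: the loop nest is tiled so that each task operates on a whole data block instead of a single element, raising task granularity. The function names and tiling scheme are illustrative sketches, not the actual code AutoParallel generates.

```python
# Illustrative sketch of loop-nest blocking (tiling): the element-wise loop
# nest is restructured so each "task" processes a whole tile at once.

def add_block(a_block, b_block):
    # One coarse-grained task: element-wise add of a whole tile.
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(a_block, b_block)]

def blocked_add(A, B, bs):
    # Tile the 2D iteration space of an n x n matrix addition into bs x bs
    # blocks; each tile becomes one unit of work a runtime could schedule.
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(0, n, bs):
        for j in range(0, n, bs):
            tile = add_block([row[j:j+bs] for row in A[i:i+bs]],
                             [row[j:j+bs] for row in B[i:i+bs]])
            for di, row in enumerate(tile):
                C[i+di][j:j+len(row)] = row
    return C

A = [[1, 2], [3, 4]]
B = [[10, 20], [30, 40]]
print(blocked_add(A, B, bs=1))  # [[11, 22], [33, 44]]
```

With a larger block size the same result is computed with fewer, heavier tasks, which is the granularity trade-off the module tunes.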

    Transparent Orchestration of Task-based Parallel Applications in Containers Platforms

    This paper presents a framework to easily build and execute parallel applications on container-based distributed computing platforms in a user-transparent way. The proposed framework is a combination of the COMP Superscalar (COMPSs) programming model and runtime, which provides a straightforward way to develop task-based parallel applications from sequential code, and container management platforms that ease the deployment of applications in computing environments (such as Docker, Mesos, or Singularity). This framework provides scientists and developers with an easy way to implement parallel distributed applications and deploy them in a one-click fashion. We have built a prototype that integrates COMPSs with different container engines in different scenarios: i) a Docker cluster, ii) a Mesos cluster, and iii) Singularity in an HPC cluster. We have evaluated the overhead in the building, deployment, and execution phases of two benchmark applications compared to a Cloud testbed based on KVM and OpenStack and to the use of bare-metal nodes. We observed an important gain compared to cloud environments during the building and deployment phases, which enables better adaptation of resources to the computational load. In contrast, we detected extra overhead during execution, mainly due to multi-host Docker networking. This work is partly supported by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology through the TIN2015-65316 project, by the Generalitat de Catalunya under contracts 2014-SGR-1051 and 2014-SGR-1272, and by the European Union through the Horizon 2020 research and innovation program under grant 690116 (EUBra-BIGSEA project). Results presented in this paper were obtained using the Chameleon testbed supported by the National Science Foundation. Peer Reviewed. Postprint (author's final draft).
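Packaging an application for such a container platform might look like the following hypothetical Dockerfile sketch; the base image, installed packages, and paths are assumptions for illustration, not the project's actual build recipe.

```dockerfile
# Hypothetical sketch: containerizing a task-based application so the same
# image can be deployed on a Docker, Mesos, or Singularity backend.
FROM ubuntu:22.04

# Runtime dependencies (illustrative: a Java runtime and Python interpreter).
RUN apt-get update && apt-get install -y --no-install-recommends \
        openjdk-11-jre-headless python3 && \
    rm -rf /var/lib/apt/lists/*

# Application code and entry point (paths are made up for this sketch).
COPY app/ /opt/app/
WORKDIR /opt/app
CMD ["python3", "launch.py"]
```

Building the image once and reusing it across engines is what makes the one-click deployment and the low build/deployment overhead reported above possible.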

    Towards Automatic Application Migration to Clouds

    Porting applications to Clouds is one of the key challenges in the software industry. The available approaches to perform this task are basically either services derived from alliances of major software vendors and Cloud providers focusing on their own products, or small platform providers focusing on the most popular software stacks. For migrating other types of software, the options are limited to Infrastructure-as-a-Service (IaaS) solutions, which require substantial programming effort to adapt the software to a Cloud provider's API. Moreover, if the software must be deployed on different providers, new integration procedures must be designed and implemented, which can be costly and error-prone. This paper presents a solution for facilitating the migration of any application to the cloud, inferring the most suitable deployment model for the application and automatically deploying it on the available Cloud providers.

    Semantic resource management and interoperability between distributed computing platforms

    Distributed Computing is the paradigm where application execution is distributed across different computers connected by a communication network. Distributed Computing platforms have evolved very fast during the last decades: starting from Clusters, where a set of computers work together in a single location; evolving to Grids, where computing resources are shared by different entities, creating a global computing infrastructure available to different user communities; and finally becoming what is currently known as the Cloud, where computing and data resources are provided on demand, in a very dynamic fashion, following the Utility Computing model, where users pay only for what they consume. Different types of companies and institutions are exploring the potential benefits of moving their IT services and applications to Cloud infrastructures in order to decouple the management of computing resources from their core business processes and become more productive. Nevertheless, migrating software to Clouds is not an easy task, since it requires a deep knowledge of the technology to decompose the application, of the capabilities offered by providers, and of how to use them. Besides this complex deployment process, the current cloud marketplace has several providers offering resources with different capabilities, prices, and quality, and each provider uses its own properties and APIs for describing and accessing its resources. Therefore, when customers want to execute an application on a provider's resources, they must understand the different providers' descriptions, compare them, and select the most suitable resources for their interests. Once the provider and resources have been selected, developers have to interoperate with the different providers' interfaces to perform the application execution steps.
To do all the aforementioned steps, application developers have to deal with the design and implementation of complex integration procedures. This thesis presents several contributions to overcome these problems by providing a platform that facilitates and automates the integration of applications into different providers' infrastructures, lowering the barrier to adopting new distributed computing infrastructures such as Clouds. The achievement of this objective has been split into several parts. In the first part, we study how semantic web technologies can help to describe applications and to automatically infer a model for deploying them on a distributed platform. Once the application deployment model has been inferred, the second step is finding the resources to deploy and execute the different application components. On this topic, we study how semantic web technologies can be applied to the resource allocation problem. Once the different components have been allocated to the providers' resources, it is time to deploy and execute the application components on these resources by invoking a workflow of provider API calls. However, every provider defines its own management interfaces, so the workflow to perform the same actions differs depending on the selected provider. In this thesis, we propose a framework, based on AI planning, to automatically infer the workflow of provider interface calls required to perform any resource management task. In the last part of the thesis, we study how to introduce the benefits of software agents for coordinating application management on distributed platforms. We propose a multi-agent system that coordinates the different steps of the application deployment in a distributed way and monitors the correct execution of the application on the computing resources.
The different contributions have been validated with a prototype implementation and a set of use cases.
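The resource-allocation step can be illustrated with a toy matcher: component requirements are compared against provider offers and the cheapest feasible offer is chosen. Plain dictionaries here stand in for the semantic (ontology-based) descriptions the thesis actually uses, and all provider names and numbers are made up.

```python
# Toy sketch of matching a component's requirements against provider offers.
# In the thesis this matching is done over semantic resource descriptions;
# dicts are a stand-in so the selection logic is visible.

offers = [
    {"provider": "P1", "cpus": 4, "mem_gb": 8,  "price": 0.20},
    {"provider": "P2", "cpus": 8, "mem_gb": 16, "price": 0.45},
]

def allocate(requirement, offers):
    # Keep offers that satisfy every numeric requirement, then pick cheapest.
    feasible = [o for o in offers
                if o["cpus"] >= requirement["cpus"]
                and o["mem_gb"] >= requirement["mem_gb"]]
    return min(feasible, key=lambda o: o["price"]) if feasible else None

best = allocate({"cpus": 4, "mem_gb": 8}, offers)
print(best["provider"])  # P1
```

A semantic matcher generalizes this by reasoning over equivalent or subsumed capability descriptions across providers' vocabularies rather than comparing identical keys.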

    Heterogeneous hierarchical workflow composition

    Workflow systems promise scientists an automated end-to-end path from hypothesis to discovery. However, expecting any single workflow system to deliver such a wide range of capabilities is impractical. A more practical solution is to compose the end-to-end workflow from more than one system. With this goal in mind, the integration of task-based and in situ workflows is explored, where the result is a hierarchical heterogeneous workflow composed of subworkflows, with different levels of the hierarchy using different programming, execution, and data models. Materials science use cases demonstrate the advantages of such heterogeneous hierarchical workflow composition. This work is a collaboration between Argonne National Laboratory and the Barcelona Supercomputing Center within the Joint Laboratory for Extreme-Scale Computing. This research is supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under contract number DE-AC02-06CH11357, program manager Laura Biven, and by the Spanish Government (SEV-2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), and by Generalitat de Catalunya (contract 2014-SGR-1051). Peer Reviewed. Postprint (author's final draft).
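The hierarchical composition can be sketched as an outer workflow whose nodes are themselves subworkflows, each free to use its own programming and execution model. The functions below are illustrative stand-ins, not the actual systems combined in the paper.

```python
# Sketch of hierarchical workflow composition: the outer workflow chains
# subworkflows that could each run under a different model (task-based
# simulation, in situ analytics). All names here are illustrative.

def simulation(steps):
    # Stand-in for a task-based simulation subworkflow producing raw data.
    return [x * x for x in range(steps)]

def in_situ_analysis(data):
    # Stand-in for an in situ analytics subworkflow reducing that data.
    return sum(data) / len(data)

def outer_workflow(steps):
    raw = simulation(steps)        # subworkflow 1: task-based model
    return in_situ_analysis(raw)   # subworkflow 2: in situ model

print(outer_workflow(4))  # mean of [0, 1, 4, 9] -> 3.5
```

In the real composition the boundary between the two subworkflows also crosses a data-model boundary, so the coupling layer, not the subworkflows, handles the translation.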

    Aplicación de un modelo de calidad para evaluar experiencias e-learning en el Espacio Europeo Universitario

    Nowadays, universities and other higher education institutions invest a lot of resources to integrate Information & Communications Technologies (ICT) into their learning processes. In particular, e-learning platforms have been incorporated to support traditional campus-based activities. However, the application of these platforms alone is not enough to improve learning and teaching processes. Innovative methods are required to design and evaluate ICT-enhanced learning experiences that contribute to changing traditional face-to-face teaching in this context. This work focuses on evaluating this kind of experience and, more specifically, on providing a procedure, based on a quality model, to guide the evaluation of educational experiences supported by e-learning platforms.

    Dynamic energy-aware scheduling for parallel task-based application in cloud computing

    Green Computing is a recent trend in computer science that tries to reduce the energy consumption and carbon footprint produced by computers on distributed platforms such as clusters, grids, and clouds. Traditional scheduling solutions attempt to minimize processing times without taking the energy cost into account. One method for reducing energy consumption is to provide scheduling policies that allocate tasks to specific resources, which affects both processing times and energy consumption. In this paper, we propose a real-time dynamic scheduling system to execute task-based applications efficiently on distributed computing platforms in order to minimize energy consumption. Since scheduling tasks on multiprocessors is a well-known NP-hard problem and finding optimal solutions is not feasible, we present a polynomial-time algorithm that combines a set of heuristic rules and a resource allocation technique to obtain good solutions on an affordable time scale. The proposed algorithm minimizes a multi-objective function that combines energy consumption and execution time according to an energy-performance importance factor provided by the resource provider or user, also taking into account sequence-dependent setup times between tasks, setup times and down times for virtual machines (VMs), and energy profiles for different architectures. A prototype implementation of the scheduler has been tested with different kinds of randomly generated DAGs as well as with real task-based COMPSs applications. We have tested the system with instances of different sizes and different importance factors, and we have evaluated which combination provides a better solution and energy savings. Moreover, we have also evaluated the overhead introduced by the scheduler by measuring the time needed to obtain scheduling solutions for different numbers of tasks, kinds of DAGs, and resources, concluding that our method is suitable for run-time scheduling. This work has been supported by the Spanish Government (contracts TIN2015-65316-P, TIN2012-34557, CSD2007-00050, CAC2007-00052 and SEV-2011-00067), by Generalitat de Catalunya (contract 2014-SGR-1051), by the European Commission (Euroserver project, contract 610456) and by Consejo Nacional de Ciencia y Tecnología of Mexico (special program for postdoctoral positions, BSC-CNS-CONACYT contract 290790, grant number 265937). Peer Reviewed. Award-winning. Postprint (published version).
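The multi-objective function can be sketched as a weighted sum of energy and time, with the importance factor shifting the balance between the two objectives. The resource profiles below are invented for illustration, and the greedy per-task choice is a simplified stand-in for the paper's full heuristic (which also models setup times, VM down times, and architecture-specific energy profiles).

```python
# Illustrative sketch of the multi-objective cost used for scheduling:
# cost = alpha * energy + (1 - alpha) * time, where alpha is the
# energy-performance importance factor. Profiles below are made up.

def cost(energy_j, time_s, alpha):
    return alpha * energy_j + (1 - alpha) * time_s

def pick_resource(task_profiles, alpha):
    # task_profiles: resource -> (energy, time) for running the task there.
    return min(task_profiles, key=lambda r: cost(*task_profiles[r], alpha))

profiles = {
    "big_vm":   (120.0, 10.0),   # fast but energy-hungry
    "small_vm": (60.0,  25.0),   # slower but frugal
}
print(pick_resource(profiles, alpha=0.9))  # energy-biased -> small_vm
print(pick_resource(profiles, alpha=0.1))  # time-biased   -> big_vm
```

Sweeping `alpha` from 0 to 1 traces the trade-off between makespan-optimal and energy-optimal placements, which is what the importance factor exposes to the user or provider.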

    A Programming Model for Hybrid Workflows: combining Task-based Workflows and Dataflows all-in-one

    This paper aims to reduce the effort of learning, deploying, and integrating several frameworks for the development of e-Science applications that combine simulations with High-Performance Data Analytics (HPDA). We propose a way to extend task-based management systems to support continuous input and output data, enabling the combination of task-based workflows and dataflows (hereafter, Hybrid Workflows) under a single programming model. Hence, developers can build complex Data Science workflows with different approaches depending on the requirements. To illustrate the capabilities of Hybrid Workflows, we have built a Distributed Stream Library and a fully functional prototype extending COMPSs, a mature, general-purpose, task-based, parallel programming model. The library can be easily integrated with existing task-based frameworks to provide support for dataflows. It also provides a homogeneous, generic, and simple representation of object and file streams in both Java and Python, enabling complex workflows to handle any data type without dealing directly with the streaming back-end. Comment: Accepted in Future Generation Computer Systems (FGCS). Licensed under CC-BY-NC-ND.
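The stream abstraction bridging tasks and dataflows can be illustrated with a minimal local sketch: a producer task publishes elements that a consumer polls, without the consumer needing to wait for the producer to finish. Threads and the `Stream` class below are illustrative stand-ins for the Distributed Stream Library, not its actual API.

```python
# Minimal local sketch of a stream connecting two tasks, in the spirit of
# a Hybrid Workflow: the producer publishes elements while the consumer
# polls them. Threads stand in for distributed tasks; names are made up.

import queue
import threading

class Stream:
    _END = object()  # sentinel marking the end of the stream

    def __init__(self):
        self._q = queue.Queue()

    def publish(self, item):
        self._q.put(item)

    def close(self):
        self._q.put(self._END)

    def poll(self):
        # Yield items as they arrive until the stream is closed.
        while True:
            item = self._q.get()
            if item is self._END:
                return
            yield item

def producer(stream):
    # Stand-in for a task-based simulation emitting partial results.
    for i in range(5):
        stream.publish(i * i)
    stream.close()

stream = Stream()
threading.Thread(target=producer, args=(stream,)).start()
print(list(stream.poll()))  # [0, 1, 4, 9, 16]
```

Hiding the queue behind `publish`/`poll` is what lets the same workflow code run over object streams, file streams, or a distributed streaming back-end.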